Abstract. We provide a probabilistic analysis of the output of Quicksort
when comparisons can err.



1. Introduction 

Suppose that a sorting algorithm, knowingly or unknowingly, uses element com- 
parisons that can err. Considering sorting algorithms based solely on binary com- 
parisons of the elements to be sorted (algorithms such as insertion sort, selection 
sort, quicksort, and so on), what problems do we face when those comparisons are 
unreliable? For example, [6] gives a clever $O(\varepsilon^{-1} \log n)$ algorithm to assure, with
probability $1 - \varepsilon$, that a putatively sorted sequence of length $n$ is truly sorted. But
knowing the structure of the ill-sorted output would likely make error checking easier.
Also, in situations in which a reliable comparison is the fruit of a long process,
one could choose to interrupt the comparison process, thus trading reliability of
comparisons (and quality of the output) for time. As a first step in order to understand
the consequences of errors, we propose to analyze the number of inversions in the 
output of a sorting algorithm (we choose Quicksort ^|) subject to errors. 

We assume throughout this paper that the elements of the sequence

$x = (x_1, x_2, \dots, x_n)$

to be sorted are distinct. We assume further that the only comparisons subject to error are those made between elements being sorted; that is, comparisons among indices and so on are always correct. Errors in element comparisons are random events, spontaneous and independent of each other, of position, and of value, with a common probability $p$, $n$ being the length of the list to be sorted. The number of inversions in the output sequence $y = (y_1, y_2, \dots, y_n)$ is denoted

$I(y) = \#\{(i,j) \mid 1 \le i < j \le n \text{ and } y_i > y_j\}$.

We assume that the input list is presented in random order, each of the $n!$ random orders being equiprobable. Finally we denote by $I(n,p)$ the random number of inversions in the output sequence of Quicksort subject to errors.
Our result is, roughly speaking,

$I(n,p) = \Theta(n^2 p)$,

when $(n,p) \to (\infty, c)$, meaning that $I(n,p)/(n^2 p)$ converges to some nondegenerate probability distribution. The "surprise", not so unexpected after the fact, is that there are phase changes in the limit law, depending on the asymptotic behaviour of $(n,p)$.

The organization of this paper is as follows: The results are stated in Section 2. In Section 3 we establish a general distributional identity for $I(n,p)$. In the remaining sections, we prove convergence results for $I(n,p)$ when:

• $p \to c$, $0 < c \le 1$,

• $p$ vanishes more slowly than $1/n$,

• $p \sim \lambda/n$ where $\lambda$ is a positive constant.

The case $np \to 0$ is different and not treated in detail; see Remark 2.10. In Section 4 we establish a general result of convergence using contraction methods (cf. [14, 15]), and we use it in Section 5 for the first two cases. These methods do not apply for Case 3, which requires poissonization (see Section 6, where we use an embedding of Quicksort in a Poisson point process).
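The error model described in this introduction can be simulated directly. The following is a minimal sketch, not the authors' code: the names `noisy_quicksort` and `inversions` are hypothetical, the pivot is assumed to be the first element of each sublist (as in Section 3), and each element-versus-pivot comparison is flipped independently with probability `p`.

```python
import random


def noisy_quicksort(items, p, rng):
    """Quicksort with the first element as pivot.

    Each comparison of an element with the pivot returns the wrong
    answer independently with probability p (assumed error model).
    """
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    left, right = [], []
    for x in rest:
        truly_smaller = x < pivot
        comparison_errs = rng.random() < p
        # An erroneous comparison sends the element to the wrong side.
        goes_left = truly_smaller != comparison_errs
        (left if goes_left else right).append(x)
    return noisy_quicksort(left, p, rng) + [pivot] + noisy_quicksort(right, p, rng)


def inversions(seq):
    """Number of pairs i < j with seq[i] > seq[j] (quadratic, for small n)."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])
```

Dividing `inversions(noisy_quicksort(data, p, rng))` by $n^2 p$ gives one sample of the normalized quantity $I(n,p)/(n^2p)$ studied in the rest of the paper.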



2. Results 

Set

$X_{n,p} = \frac{I(n,p)}{n^2 p}$.

We will always let U denote a random variable that is uniformly distributed on 
[0, 1]. Also, N* shall denote the set of positive integers, and N the set of nonnegative 
integers. 

Case 1: $\lim p = c > 0$.

Theorem 2.1. If $\lim p = c$, $c \in (0,1]$, then $X_{n,p}$ converges in distribution to a random variable $X_c$ whose distribution is characterized as the unique solution with finite mean of the equation

(1)  $X_c \stackrel{d}{=} [(1-2c)U + c]^2 X_c + [(2c-1)U + 1 - c]^2 \tilde X_c + T(c,U)$,

in which $\tilde X_c$ denotes a copy of $X_c$, $(X_c, \tilde X_c, U)$ are independent, and

$T(c,U) = \frac{1-c}{2}\,(U^2 + (1-U)^2) + cU(1-U)$.

Furthermore,

$E[X_c] = \frac{2-c}{2(1+2c-2c^2)}$,

$\mathrm{Var}(X_c) = \frac{(1-c)^2(1-2c)^2}{4(1+2c-2c^2)^2(3+6c-8c^2+4c^3-2c^4)}$.

As usual with laws related to Quicksort, see e.g. ^^Eli ""-U is approximately 
the position of the pivot of the first step of the algorithm. As in standard Quicksort recurrences, the coefficients of $X_c$ and of its independent copy $\tilde X_c$ are related to the sizes of the two sublists on the left and right of the pivot, sizes respectively asymptotic to $n((1-2c)U + c)$ and $n((2c-1)U + 1 - c)$. The toll function $T(c,U)$ is approximately $(n^2p)^{-1} \sim (n^2c)^{-1}$ times the number of inversions created in the first step: $c(1-c)n^2U^2/2$ is approximately the number of inversions of the $cnU$ elements, smaller than the pivot but misplaced on the right of it, with the $(1-c)nU$ elements smaller than the pivot, that are placed, as they should be, on the left; $c^2n^2U(1-U)$ is the number of inversions between misplaced elements from the two sides of the pivot. The toll function $T(c,U)$ depends on only one of the two sources of randomness (the randomly ordered input list, and the places of the errors), viz., the first one, through $U$. The second source of randomness is killed by the law of large numbers: on average, each of the $cnU + o(n)$ misplaced numbers from the right of the pivot produces inversions with one half of the $(1-c)nU + o(n)$ elements smaller than the pivot, that are placed, as they should be, on the left. As opposed to the other values of $c$, the choices $c = 0.5$ and $c = 1$ lead to deterministic $X_c = 1/2$, without any surprise: for $p = 0.5$ the output sequence is a random uniform permutation, with a number of inversions concentrated around $n^2/4$ [10, Chap. 5.1.1]; for $p = 1$ the output sequence is decreasing, and has $n(n-1)/2$ inversions.
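The moment formulas of Theorem 2.1 can be checked mechanically. Taking expectations in (1) gives $E[X_c]\,(1 - E[A] - E[B]) = E[T]$, where $A$ and $B$ denote the two squared coefficients; squaring (1) first gives a similar linear equation for the second moment (assuming it is finite). The sketch below evaluates both with exact rational arithmetic on polynomials in $U$; the helper names (`limit_moments`, `pmul`, ...) are hypothetical and not part of the paper.

```python
from fractions import Fraction


def pmul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def padd(a, b):
    """Sum of two polynomials."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def pscale(s, a):
    """Multiply a polynomial by a scalar."""
    return [s * x for x in a]


def mean01(a):
    """E[a(U)] for U uniform on [0, 1], integrating term by term."""
    return sum(coef / (k + 1) for k, coef in enumerate(a))


def limit_moments(c):
    """Mean and variance of a finite-variance solution of equation (1)."""
    c = Fraction(c)
    u = [Fraction(0), Fraction(1)]
    one_minus_u = [Fraction(1), Fraction(-1)]
    a_root = [c, 1 - 2 * c]        # (1 - 2c) U + c
    b_root = [1 - c, 2 * c - 1]    # (2c - 1) U + 1 - c
    A = pmul(a_root, a_root)
    B = pmul(b_root, b_root)
    squares = padd(pmul(u, u), pmul(one_minus_u, one_minus_u))
    T = padd(pscale((1 - c) / 2, squares), pscale(c, pmul(u, one_minus_u)))
    mean = mean01(T) / (1 - mean01(A) - mean01(B))
    # Squaring (1) and using independence:
    # Var(X) * (1 - E[A^2] - E[B^2]) = E[(T + (A + B - 1) E[X])^2].
    centred = padd(T, pscale(mean, padd(padd(A, B), [Fraction(-1)])))
    denom = 1 - mean01(pmul(A, A)) - mean01(pmul(B, B))
    var = mean01(pmul(centred, centred)) / denom
    return mean, var
```

For instance, `limit_moments(Fraction(1, 2))` and `limit_moments(1)` both return mean 1/2 and variance 0, in line with the deterministic limits discussed above.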



Case 2: $p$ vanishes more slowly than $1/n$.

Theorem 2.2. If $\lim p = 0$ and $\lim np = +\infty$, $X_{n,p}$ converges in distribution to a random variable $X$ whose distribution is characterized as the unique solution with finite mean of the equation

(2)  $X \stackrel{d}{=} U^2 X + (1-U)^2 \tilde X + \frac{U^2 + (1-U)^2}{2}$.

In (2), $\tilde X$ denotes a copy of $X$ and $(X, \tilde X, U)$ are independent. Furthermore,

$E[X] = 1$ and $\mathrm{Var}(X) = \frac{1}{12}$.

Note that equation (2) is just (1) specialized to $c = 0$, but, as opposed to $c \ne 0$, an additional condition, $p \gg 1/n$, is needed to ensure that the law of large numbers still holds. Also, as another difference between (1) and (2), for $p \ll 1$ the errors do not change the sizes of the sublists in a significant way. The solution $X$ equals half the sum of the squares of the widths of the random intervals $[Y_{k,j}, Y_{k,j+1}]$ defined by (3) below. This is equivalent to the following statement:

Proposition 2.3. The solution $X$ equals half the area $\int_0^1 Z(t)\,dt$ under the FIND limit process $Z$.

For both these claims, see Remark 2.7. The FIND process was introduced in [7] and is pictured in Figure 1.



Case 3: $\lim np = \lambda$.

Assume that

• $\Pi$ is a Poisson point process with intensity $\lambda$ on $\mathbb{N}^* \times [0,1]$, meaning that, for each $n$, $|\Pi \cap (\{n\} \times [0,1])|$ is a Poisson random variable with mean $\lambda$, and the second coordinates of points of $\Pi$ are uniform on $[0,1]$ and independent (see [9] for a general definition of Poisson point processes);

• $\{U_{k,j} : k \ge 0,\ 1 \le j \le 2^k\}$ is an array of independent uniform random variables on $[0,1]$, independent of $\Pi$;

• the random variables $(Y_{k,j},\ k \ge 0,\ 0 \le j \le 2^k)$ are defined recursively by

(3)  $Y_{0,0} = 0$, $Y_{0,1} = 1$, $Y_{k+1,2j} = Y_{k,j}$ for $0 \le j \le 2^k$,
     $Y_{k+1,2j-1} = (1 - U_{k,j})\,Y_{k,j-1} + U_{k,j}\,Y_{k,j}$ for $1 \le j \le 2^k$;

• for $x \in [0,1]$, $J_k(x) = 2j - 1$ if $Y_{k-1,j-1} \le x < Y_{k-1,j}$,

and define, for $\lambda > 0$ (the sum is a.s. finite by Lemma 6.4),

(4)  $X(\lambda) = \frac{1}{\lambda} \sum_{(k,x) \in \Pi} |x - Y_{k,J_k(x)}|$.

Figure 1. The FIND process.

The variables $Y_{k,j}$ describe a fragmentation process (see |Jj for historical references): we start with $[0,1)$ and recursively break each interval into two at a random point (uniformly chosen). In the $k$-th generation we thus have a partition of $[0,1)$ into $2^k$ intervals $I_{k,j}$, $1 \le j \le 2^k$, with $I_{k,j} = [Y_{k,j-1}, Y_{k,j})$. The interval of generation $k-1$ that contains $x$ is cut at step $k$ at the point $Y_{k,J_k(x)}$. Hence $|x - Y_{k,J_k(x)}|$ in (4) is the distance from $x$ to this cut point.
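The fragmentation recursion (3) is easy to run. The sketch below is an illustration only, with hypothetical function names: it builds the cut points $Y_{k,j}$ generation by generation and, as a by-product, a truncated version of "half the sum of the squares of the interval widths" mentioned after Theorem 2.2. The assumption that this sum runs over generations $k \ge 1$ is our reading of Remark 2.7.

```python
import random


def fragmentation(depth, rng):
    """Return [Y_0, ..., Y_depth], where Y_k = [Y_{k,0}, ..., Y_{k,2^k}]."""
    levels = [[0.0, 1.0]]                   # Y_{0,0} = 0, Y_{0,1} = 1
    for _ in range(depth):
        prev = levels[-1]
        nxt = [prev[0]]                     # Y_{k+1,0} = Y_{k,0}
        for j in range(1, len(prev)):
            u = rng.random()                # plays the role of U_{k,j}
            cut = (1 - u) * prev[j - 1] + u * prev[j]   # Y_{k+1,2j-1}
            nxt.extend([cut, prev[j]])                  # then Y_{k+1,2j} = Y_{k,j}
        levels.append(nxt)
    return levels


def half_sum_of_squares(levels):
    """(1/2) * sum over generations k >= 1 of the squared interval widths."""
    total = 0.0
    for level in levels[1:]:
        total += sum((b - a) ** 2 for a, b in zip(level, level[1:]))
    return total / 2
```

Increasing `depth` adds further nonnegative terms to `half_sum_of_squares`; averaging it over many independent runs gives a rough numerical feel for the limit variable of Theorem 2.2.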

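Theorem 2.4. If $\lim p = 0$ and $\lim np = \lambda > 0$, then $X_{n,p}$ converges in distribution to $X(\lambda)$. The family $\{X(\lambda)\}_{\lambda>0}$ of random variables satisfies the distributional identity:

(5)  $X(\lambda) \stackrel{d}{=} U^2 X(\lambda U) + (1-U)^2 \tilde X(\lambda(1-U)) + \Theta(\lambda, U)$,

in which, conditionally given that $U = u$, $X(\lambda U)$, $\tilde X(\lambda(1-U))$ and $\Theta(\lambda, U)$ are independent, $X(\lambda U)$ and $\tilde X(\lambda(1-U))$ are distributed as $X(\lambda u)$ and $X(\lambda(1-u))$, respectively, and

$\Theta(\lambda, u) = \frac{1}{\lambda} \sum_{i=1}^{N_\lambda} |u - V_i|$,

in which $N_\lambda$ is a Poisson random variable with mean $\lambda$, the random variables $V_i$ are uniformly distributed on $[0,1]$, and $N_\lambda$ and the $V_i$'s are independent. Furthermore,

(6)  $E[X(\lambda)] = 1$, $\mathrm{Var}(X(\lambda)) = \frac{1}{12} + \frac{1}{3\lambda}$.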

Remark 2.5. The distributional identities (1), (2) and (5) really are equations for distributions, but it is more convenient to state them for random variables as done here. For (5) to make sense, i.e. in order to ensure, for instance, that $X(\lambda U)$ is a random variable, it is implicitly assumed that the random variables $X(\lambda)$ depend measurably on $\lambda$. Thus a solution of (5) is a family of probability measures $\mu = (\mu_\lambda)_{\lambda>0}$ on $[0,+\infty)$, such that there exists a family $Y = (Y(\lambda))_{\lambda>0}$ of random variables defined on the same probability space $(\Omega, \mathcal{A}, P)$ and satisfying the following properties:

(i) for $\lambda > 0$, $\mu_\lambda$ is the distribution of $Y(\lambda)$,

(ii) $Y$ is a measurable process [8, Chap. 1], meaning that the mapping
$(\lambda,\omega) \mapsto Y(\lambda,\omega) : \big((0,+\infty) \times \Omega,\ \mathcal{B}((0,+\infty)) \otimes \mathcal{A}\big) \to \big([0,+\infty),\ \mathcal{B}([0,+\infty))\big)$
is measurable,

and such that (5) holds for $Y$. A measurable version of the stochastic process $X = (X(\lambda))_{\lambda>0}$ is defined at (8) below (measurability follows from [8, Rem. 1.14]).

For uniqueness, we need extra assumptions: let $\mathcal{M}$ denote the class of families of distributions $\mu = (\mu_\lambda)_{\lambda>0}$ satisfying (i) and (ii) above, plus the condition:

(iii) for some $\alpha \in (0,1)$, the function
$\lambda \mapsto \lambda^{\alpha} E[Y(\lambda)]$
is bounded on any bounded interval of $(0,+\infty)$.

Let $\nu_\lambda$ denote the distribution of $X(\lambda)$. We have

Theorem 2.6. The family $\nu = (\nu_\lambda)_{\lambda>0}$ is the unique solution of (5) in $\mathcal{M}$.

We do not know whether the extra assumption (iii) is necessary. Let us comment further on equation (4). Writing $\Pi_k = \{x : (k,x) \in \Pi\}$ and $\Pi_{k,j} = \Pi_k \cap I_{k,j}$, we can thus rewrite (4) as

(7)  $X(\lambda) = \frac{1}{\lambda} \sum_{k=1}^{\infty} \sum_{j=1}^{2^k} \sum_{x \in \Pi_{k,j}} |x - x_{k,j}|$,

where $x_{k,j}$ is either the left or right endpoint of $I_{k,j}$ (depending on whether $j$ is even or odd).

Note that, conditioned on the partitions $\{I_{k,j}\}$, i.e. on $\{Y_{k,j}\}_{k,j}$, each $\Pi_{k,j}$ is a Poisson process on $I_{k,j}$ with intensity $\lambda$, with the processes $\Pi_{k,j}$ independent. Since only the distribution of $X(\lambda)$ matters, we can by this conditioning and an obvious symmetry of the Poisson processes $\Pi_{k,j}$ just as well let $x_{k,j}$ in (7) be the left endpoint of $I_{k,j}$ for every $k$ and $j$.

Let $\Pi'$ be a Poisson process on $(0,1] \times (0,\infty)$ with intensity 1, and let $\Xi(t) = \sum_{(x,y) \in \Pi',\ y \le t} x$, $t \ge 0$. (This is a pure jump Lévy process with Lévy measure $1_{(0,1]}\,dt$.) Let $\Xi^{(k,j)}(t)$ be independent copies of this process, independent of $\{Y_{k,j}\}$. A scaling argument shows that (7) can be written

(8)  $X(\lambda) = \frac{1}{\lambda} \sum_{k=1}^{\infty} \sum_{j=1}^{2^k} |I_{k,j}|\; \Xi^{(k,j)}(\lambda |I_{k,j}|)$.

Remark 2.7. Let $X = \frac{1}{2} \sum_{k=1}^{\infty} \sum_{j=1}^{2^k} |I_{k,j}|^2$. Then $X$ satisfies (2), so that $X$ is the limit variable $X$ in Theorem 2.2 ($X$ is a.s. finite and has finite mean by Lemma 6.1). Moreover, the FIND limit process $Z$ in [7] is defined by $Z(t) = \sum_{k=1}^{\infty} \sum_{j=1}^{2^k} 1_{\{t \in I_{k,j}\}} |I_{k,j}|$; hence $\int_0^1 Z(t)\,dt = \sum_{k=1}^{\infty} \sum_{j=1}^{2^k} |I_{k,j}|^2 = 2X$. This justifies Proposition 2.3.

Moreover, by the law of large numbers, $E|\lambda^{-1}\Xi(\lambda) - 1/2| \to 0$ as $\lambda \to \infty$. It follows (by dominated convergence) that for the special version of $X(\lambda)$ defined at (8)

$E|X(\lambda) - X| \to 0$,

and hence $X(\lambda)$ converges to $X$ in distribution as $\lambda \to \infty$.

In this third case, we have a system of equations involving an infinite family of laws, and we could not adapt the contraction method: we rather use a poissonization. The phase transition from (2) to (5) is explained easily: instead of a number of errors $\gg 1$, we have now $O(1)$ errors at each step, and the law of large numbers does not hold anymore for the number of inversions produced by step 1. Actually the number $N_\lambda$ of errors at the first step is asymptotically Poisson distributed, and the $N_\lambda$ errors are at positions $nV_i$, with the $V_i$ approximately uniformly distributed on $[0,1]$. Thus, the number of inversions caused by this first step is approximately

$n \sum_{i=1}^{N_\lambda} |U - V_i| \approx n^2 p\, \Theta(\lambda, U)$.
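As a quick consistency check (a worked computation added here, not taken from the source), the first two moments of the first-step toll follow from its compound Poisson structure. Writing $\Theta(\lambda,u) = \lambda^{-1}\sum_{i \le N_\lambda} |u - V_i|$ with $N_\lambda$ Poisson of mean $\lambda$ and the $V_i$ independent and uniform on $[0,1]$, as in Theorem 2.4:

```latex
\begin{align*}
E[\Theta(\lambda,u)]
  &= \frac{1}{\lambda}\,E[N_\lambda]\,E|u - V_1|
   = \int_0^1 |u - v|\,dv
   = \frac{u^2 + (1-u)^2}{2},\\
\operatorname{Var}\bigl(\Theta(\lambda,u)\bigr)
  &= \frac{1}{\lambda^2}\,\lambda\,E\bigl[|u - V_1|^2\bigr]
   = \frac{u^3 + (1-u)^3}{3\lambda}.
\end{align*}
```

The mean coincides with the deterministic toll obtained from (1) at $c = 0$, while the variance is of order $1/\lambda$: this is the extra randomness that the law of large numbers removes when $np \to \infty$.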

Remark 2.8. Actually we prove a stronger theorem in each of the three cases, as we prove convergence of laws for the Wasserstein $d_1$ metric [13]. It entails convergence of the first moment. The convergence of higher moments is an open problem.

Remark 2.9. As we shall see in Section 6, the distribution tail $P(X(\lambda) \ge x)$ decreases exponentially fast (Theorem 6.5).

Remark 2.10. When $np \to 0$ very slowly, that is $(np)^{-1} \ll \log n$, we conjecture that $2np \log(I(n,p)/n)$ converges in distribution to $\log U$, with the consequence that $n^{1-\varepsilon} \ll I(n,p) \ll n$, for any positive $\varepsilon$. Actually, the main contribution to $I(n,p)$ comes from the "first" error, in some sense. When $(np)^{-1} \sim \log n$, the probability that no error occurs has a positive limit: we conjecture that, conditionally given the occurrence of at least one error, the situation is similar to the previous case, that is, $\log(I(n,p))/\log n$ converges in distribution to a random variable with values in $(0,1)$. When $(np)^{-1} \gg \log n$, $P(I(n,p) = 0) \to 1$.

Remark 2.11. Finally, we would like to stress that in the proof of convergence for one of the three regimes considered in this Section, we have to deal simultaneously with any sequence $(n,p)$ converging to $(+\infty, c)$ according to this regime. This can be observed on the key equation (9), for instance, in which we would like to argue, roughly speaking, that if $(n,p)$ is close to $(+\infty, c)$ according to a given regime, then $(Z_{n,p} - 1, p)$ and $(n - Z_{n,p}, p)$ are also close to $(+\infty, c)$ according to the same regime, with a large probability: here the same probability $p$ is associated to three different integers, $n$, $Z_{n,p} - 1$ and $n - Z_{n,p}$, that denote the sizes of the input list, and of the two sublists formed at the first step of Quicksort, respectively. Thus $p$ cannot be seen as a sequence indexed by $n$. In order to allow such a loose relation between $n$ and $p$, filters turn out to be more handy than sequences (see [2, Chap. I]). Convergences in the three regimes are thus understood as convergences along the three corresponding filters (see Theorem 4.2).

3. A distributional identity for the number of inversions

At the first step Quicksort compares all elements of the input list with the first element of the list (usually called pivot). All items less (resp. larger) than the pivot are stored in a sublist on the left (resp. right) of the pivot. Comparisons are not reliable, therefore $s_\ell$ items that should belong to the left sublist are wrongly stored in the right sublist, and $s_r$ items larger than the pivot are misplaced in the left sublist.

Since its items are chosen randomly, the input list is a random permutation and the true rank of the pivot can be written $\lceil nU \rceil$, where $U$ is uniformly distributed on $[0,1]$ and $\lceil x \rceil$ is the ceiling of $x$. Also, conditionally given $U$, $s_\ell$ (resp. $s_r$) is a binomial random variable with parameters $(\lceil nU \rceil - 1, p)$ (resp. $(n - \lceil nU \rceil, p)$). Quicksort with error is then independently applied on the left sublist $\ell$ and on the right sublist $r$ and new errors occur, ultimately producing two new sublists $\hat\ell$ and $\hat r$. Set

$Z_{n,p} = \lceil nU \rceil - s_\ell + s_r$,

so that $Z_{n,p} - 1$ (resp. $n - Z_{n,p}$) is the size of $\ell$ and $\hat\ell$ (resp. $r$ and $\hat r$).

In order to enumerate the inversions of the output list, we introduce a purely fictitious error-correcting algorithm that parallels the implementation of Quicksort. This fictitious error-correcting algorithm has two recursive steps:

• First, the error-correcting algorithm corrects the sublists $\hat\ell$ (resp. $\hat r$) at costs $L = I(\hat\ell)$ (resp. $R = I(\hat r)$), producing two increasing sublists $\bar\ell$ and $\bar r$. Note that $L$ and $R$ are conditionally independent, given $Z_{n,p}$. Furthermore, the two sublists $\ell$ and $r$ obtained at the end of Step 1 are in uniform random order before the second step of Quicksort, so that, conditionally given $Z_{n,p}$, cost $L$ (resp. $R$) is distributed as $I(Z_{n,p} - 1, p)$ (resp. $I(n - Z_{n,p}, p)$).

• Then the error-correcting algorithm corrects the errors of Step 1, at a cost $t(n,p) = I(\bar\ell \,\|\, \text{pivot} \,\|\, \bar r)$. Here $\bar\ell \,\|\, \text{pivot} \,\|\, \bar r$ stands for the list obtained when one puts $\bar\ell$, the pivot and $\bar r$ side by side. The number of inversions $t(n,p)$ in the list $\bar\ell \,\|\, \text{pivot} \,\|\, \bar r$ is analyzed in detail at the end of this section.

These two steps lead to the following equation for $I(n,p)$:

(9)  $I(n,p) \stackrel{d}{=} I(Z_{n,p} - 1, p) + I'(n - Z_{n,p}, p) + t(n,p)$

where $Z_{n,p} = \lceil nU \rceil - s_\ell + s_r$. We shall obtain the asymptotic distribution of $t(n,p)$, and as a consequence (9) will translate, after renormalisation, into a distributional identity satisfied by the limit law of $I(n,p)/(n^2p)$. The limit law appears on both sides of the distributional identity, as expected, due to the recursive structure of Quicksort, and is thus characterized as the fixed point of some transformation.

Description of $t(n,p)$. At the end of the first step of the error-correcting algorithm, we obtain two subarrays $\bar\ell$ and $\bar r$, left and right of the pivot (cf. Figure 2). They are sorted in increasing order but there are $s_r$ (red) elements larger than the pivot just to its left and $s_\ell$ (green) elements smaller than the pivot element just to its right. Thus, the only misplaced elements that the proofreader must correct in step 2 are clustered around the pivot.

In order to sort the list, the red and green sublists must be exchanged. This requires $s_\ell s_r + s_\ell + s_r$ inversions. We get therefore two unsorted lists $\tilde\ell$ and $\tilde r$ each composed of two sorted sublists. All items of $\tilde\ell$ (resp. of $\tilde r$) are now smaller (resp. larger) than the pivot, so that the length of $\tilde\ell$ (resp. of $\tilde r$) is $\lceil nU \rceil - 1$ (resp. $n - \lceil nU \rceil$). It remains to sort $\tilde\ell$ and $\tilde r$, at respective costs $W_\ell$ and $W_r$ that are conditionally independent given $U$, leading to:

(10)  $t(n,p) = s_\ell s_r + s_\ell + s_r + W_\ell + W_r$.

Figure 2. The error-correcting algorithm.

Figure 3. The two sublists $\tilde\ell$ and $\tilde r$. Legend: errors; pivot; elements less than the pivot; elements larger than the pivot.

A model for $(W_\ell, W_r)$. Let $W_m$ be the number of inversions in a list of $m$ elements
sorted as follows: each element is painted black (resp. white) with probability $p$ (resp. $1 - p$). Then the black and white sublists are separately sorted in increasing order and the two sorted sublists are placed side by side, producing a new list $h$ with $m$ elements. We have

Proposition 3.1. Let $Y_1, \dots, Y_m$ be $m$ independent Bernoulli random variables with the same parameter $p$, and let $S_m = Y_1 + \dots + Y_m$. Then

$W_m \stackrel{d}{=} \sum_{i=1}^{m} i\,Y_i - \frac{S_m(S_m+1)}{2}$.

Proof. Let us abbreviate $S_m$ to $S$. Among the $Y_i$'s, let $Y_{i_1}, \dots, Y_{i_S}$ denote the $S$ random variables equal to 1, $Y_{i_{S+1}}, \dots, Y_{i_m}$ those equal to 0, with $i_1 < \dots < i_S$ and $i_{S+1} < \dots < i_m$. Now $W_m$ can be seen as the number of inversions of the list $(i_j)_{1 \le j \le m}$. In order to move the numbers $i_j$ ($j \le S$) to the correct position, the proofreader corrects inversions with each of the $i_j - j$ elements of $\{1, \dots, m\}$ that are smaller than $i_j$ and do not belong to $\{i_1, \dots, i_S\}$. Thus

(11)  $W_m = \sum_{j=1}^{S} (i_j - j)$,

leading to the result. □
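The identity behind Proposition 3.1 holds path by path and can be tested on random colourings. The sketch below is illustrative only: the function names are hypothetical, and it assumes (as the proof suggests) that the sorted black sublist is placed to the left of the sorted white one.

```python
import random


def count_inversions(seq):
    """Number of pairs i < j with seq[i] > seq[j]."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])


def painted_list(m, p, rng):
    """Paint 1..m black with probability p; return (colours, black + white).

    colours[i - 1] is True when element i is black (Y_i = 1).
    Both sublists are increasing; black is assumed to go on the left.
    """
    colours = [rng.random() < p for _ in range(m)]
    black = [i for i in range(1, m + 1) if colours[i - 1]]
    white = [i for i in range(1, m + 1) if not colours[i - 1]]
    return colours, black + white


def closed_form(colours):
    """sum_i i * Y_i - S (S + 1) / 2, the right-hand side of Proposition 3.1."""
    s = sum(colours)
    weighted = sum(i for i, y in enumerate(colours, start=1) if y)
    return weighted - s * (s + 1) // 2
```

Comparing `count_inversions(lst)` with `closed_form(colours)` for many outputs of `painted_list` reproduces equation (11) term by term.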

With the help of Proposition 3.1 we can give a useful description of the distribution of $(s_\ell, W_\ell)$ and $(s_r, W_r)$:

Proposition 3.2. Conditionally given that the length of $\tilde\ell$ is $m - 1$, $(s_\ell, W_\ell)$ and $(s_r, W_r)$ are independent and distributed as $(S_{m-1}, W_{m-1})$ and $(S_{n-m}, W_{n-m})$, respectively.

To sum up the results of this section, renormalizing (9), one obtains a distributional identity satisfied by $X_{n,p}$:

(12)  $X_{n,p} \stackrel{d}{=} A_{n,p}\, X_{Z_{n,p}-1,\,p} + B_{n,p}\, \tilde X_{n-Z_{n,p},\,p} + T_{n,p}$,

in which

(13)  $Z_{n,p} = \lceil nU \rceil - s_\ell + s_r$,

(14)  $A_{n,p} = \left(\frac{Z_{n,p} - 1}{n}\right)^2$,

(15)  $B_{n,p} = \left(\frac{n - Z_{n,p}}{n}\right)^2$,

(16)  $t(n,p) = s_\ell s_r + s_\ell + s_r + W_\ell + W_r$,

(17)  $T_{n,p} = \frac{t(n,p)}{n^2 p}$,

and

• $U$ is a uniform random variable on $[0,1]$, and $\lceil nU \rceil$ is the position of the pivot,

• conditionally given $\lceil nU \rceil = m$, $(s_\ell, W_\ell)$ and $(s_r, W_r)$ are distributed as in Proposition 3.2,

• $X = (X_{m,p})_{m \ge 0}$ and $\tilde X = (\tilde X_{m,p})_{m \ge 0}$ are two independent sequences with the same (unknown) distribution, independent of $(U, s_\ell, W_\ell, s_r, W_r)$, and therefore of $(A_{n,p}, B_{n,p}, Z_{n,p}, T_{n,p})$.

The errors have a balancing effect: $Z_{n,p} = \lceil nU \rceil - s_\ell + s_r$ has the same mean, $(n+1)/2$, and a smaller variance than $\lceil nU \rceil$. We prove this in the following form.

Lemma 3.3.

$E\big[(Z_{n,p} - 1)^2 + (n - Z_{n,p})^2\big] \le E\big[(\lceil nU \rceil - 1)^2 + (n - \lceil nU \rceil)^2\big] = \frac{(n-1)(2n-1)}{3}$.

Proof. The left hand side is the expected number of ordered pairs $(i,j)$ that end up on a common side of the pivot. This happens if $i$ and $j$ originally are on the same side of the pivot and we either compare both correctly or make errors for both of them, or if they are on opposite sides of the pivot and we make an error for exactly one of them. Hence

$E\big[(Z_{n,p} - 1)^2 + (n - Z_{n,p})^2\big] = (p^2 + (1-p)^2)\, E\big[(\lceil nU \rceil - 1)^2 + (n - \lceil nU \rceil)^2\big] + 2p(1-p)\, 2E\big[(\lceil nU \rceil - 1)(n - \lceil nU \rceil)\big]$

$= E\big[(\lceil nU \rceil - 1)^2 + (n - \lceil nU \rceil)^2\big] - 2p(1-p)\, E\big[(\lceil nU \rceil - 1 - (n - \lceil nU \rceil))^2\big]$,

which proves the first inequality. The rest is a simple calculation. □
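For completeness, the "simple calculation" can be spelled out as follows (a worked computation added here). Since $\lceil nU \rceil$ is uniform on $\{1, \dots, n\}$, both $\lceil nU \rceil - 1$ and $n - \lceil nU \rceil$ are uniform on $\{0, \dots, n-1\}$:

```latex
\begin{align*}
E\bigl[(\lceil nU\rceil - 1)^2 + (n - \lceil nU\rceil)^2\bigr]
  &= \frac{2}{n}\sum_{k=0}^{n-1} k^2
   = \frac{2}{n}\cdot\frac{(n-1)\,n\,(2n-1)}{6}
   = \frac{(n-1)(2n-1)}{3},\\
E\bigl[(\lceil nU\rceil - 1 - (n - \lceil nU\rceil))^2\bigr]
  &= 4\operatorname{Var}\bigl(\lceil nU\rceil\bigr)
   = \frac{n^2-1}{3} \;\ge\; 0 .
\end{align*}
```

In particular the right-hand side of the lemma is at most $\frac{2}{3}n^2$, which is the form in which a bound of this type enters geometric estimates over the recursion depth.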

Let us say that an element $a$ of the list, or the comparison in which $a$ plays the role of pivot, has depth $k$ if $a$ experiences $k - 1$ comparisons before playing the role of pivot. We assume in this Section that any comparison with depth $k + 1$ is performed after the last comparison with depth $k$. We call step $k$ the set of comparisons with depth $k$, and we let $I^{(k)}(n,p)$ denote the number of inversions created at step $k$, that is, the total number of inversions, in the output, between elements that are still in the same sublist before step $k$, but are not in the same sublist after step $k$. We shall need the following bound:

Lemma 3.4. For every $k \ge 1$,

$E\big[I^{(k)}(n,p)\big] \le \frac{1}{3}\left(\frac{2}{3}\right)^{k-1} n^2 p$.

Proof. For $k = 1$, $I^{(1)}(n,p) = t(n,p)$, and a simple calculation yields

$E[t(n,p)] = p\,\frac{(n-1)(n+1)}{3} - p^2\,\frac{(n-1)(n-2)}{6} \le \frac{1}{3}\, n^2 p$.

For $k > 1$ we find by induction, conditioning on the partition in the first step,

$E\big[I^{(k)}(n,p)\big] \le E\left[\frac{1}{3}\left(\frac{2}{3}\right)^{k-2} (Z_{n,p} - 1)^2 p + \frac{1}{3}\left(\frac{2}{3}\right)^{k-2} (n - Z_{n,p})^2 p\right]$,

and the result follows by Lemma 3.3. □

Proposition 3.5. Set $a_{n,p} = E[X_{n,p}]$. Then

$a_{n,p} \le 1$.

Proof. By Lemma 3.4, $a_{n,p} \le \sum_{k=1}^{\infty} \frac{1}{3}\left(\frac{2}{3}\right)^{k-1} = 1$. □

4. Fixed point theorems



The proofs of the first two cases are examples of the contraction method [14, 15]: on one hand we have more or less explicitly defined random variables $A^{(i)}_{n,p}$, $1 \le i \le I$, and $T_{n,p}$, and we know how to prove directly that they converge to $A^{(i)}$, $T$. On the other hand, we have a family $X_{n,p}$ of random variables defined by induction:

(18)  $X_{n,p} \stackrel{d}{=} \sum_{i=1}^{I} A^{(i)}_{n,p}\, X^{(i)}_{Z^{(i)}_{n,p},\,p} + T_{n,p}$,

and a random variable $X$ implicitly defined by the distributional identity

(19)  $X \stackrel{d}{=} \sum_{i=1}^{I} A^{(i)} X^{(i)} + T$,

in which, in some sense, $\lim Z^{(i)}_{n,p} = +\infty$. Then, under additional technical conditions, the convergence of the "coefficients" $A^{(i)}_{n,p}$, $T_{n,p}$, entails the convergence of the "solution" $X_{n,p}$. One has to prove existence and uniqueness of the solutions, usually as fixed points of contracting transformations in a subspace of the space of probability measures, with a suitable metric. In the case we are interested in, (18) holds and:

• $I$ is a fixed positive integer;

• $C_{n,p} = (A^{(1)}_{n,p}, Z^{(1)}_{n,p}, \dots, A^{(I)}_{n,p}, Z^{(I)}_{n,p}, T_{n,p})$ is a given random vector for each $n, p$;

• $Z^{(i)}_{n,p} \in \{0, \dots, n-1\}$;

• the families $(X^{(i)}_{n,p})_{n,p}$, $i = 1, 2, \dots, I$, are i.i.d. and independent of $C_{n,p}$, and $X^{(i)}_{n,p} \stackrel{d}{=} X_{n,p}$.

Given such $C_{n,p}$ we thus define, for any distributions $G_{0,p}, \dots, G_{n-1,p}$,

$\Phi(G_{0,p}, \dots, G_{n-1,p}) = \mathcal{L}\left(\sum_{i=1}^{I} A^{(i)}_{n,p}\, X^{(i)}_{Z^{(i)}_{n,p},\,p} + T_{n,p}\right)$,

when, as above, the families $(X^{(i)}_{k,p})_{k,p}$, $i = 1, 2, \dots, I$, are i.i.d. and independent of $C_{n,p}$, and further $X^{(i)}_{k,p}$ has the distribution $G_{k,p}$. Thus (18) can be written

$G_{n,p} = \Phi(G_{0,p}, \dots, G_{n-1,p})$.

For (19) we similarly assume

• $C = (A^{(1)}, \dots, A^{(I)}, T)$ is a given random vector;

• the variables $X^{(i)}$, $i = 1, 2, \dots, I$, are i.i.d. and independent of $C$, and $X^{(i)} \stackrel{d}{=} X$.

Given such $C$ we define

$\Psi(F) = \mathcal{L}\left(\sum_{i=1}^{I} A^{(i)} X^{(i)} + T\right)$,

when the variables $X^{(i)}$, $i = 1, 2, \dots, I$, are i.i.d. with distribution $F$ and independent of $C$. Then (19) can be written

$\Psi(F) = F$.

Let $D$ be the space of probability measures $\mu$ on $\mathbb{R}$ such that $\int_{\mathbb{R}} |x|\, d\mu(x) < +\infty$.
The space $D$ is endowed with the Wasserstein metric

(20)  $d_1(\mu, \nu) = \inf_{\mathcal{L}(X) = \mu,\ \mathcal{L}(Y) = \nu} \|X - Y\|_1 = \|F^{-1}(U) - G^{-1}(U)\|_1$,

in which $F$ and $G$ denote the distribution functions of $\mu$ and $\nu$, $F^{-1}$ (resp. $G^{-1}$) denote the generalized inverses of $F$ and $G$ and, as in previous sections, $U$ is a uniform random variable on $[0,1]$. Since $F^{-1}(U)$ (resp. $G^{-1}(U)$) has distribution $\mu$ (resp. $\nu$), the infimum is attained in relation (20).
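For two empirical measures with the same number of equally weighted atoms, the quantile coupling in (20) amounts to pairing the sorted samples. A minimal sketch (the function name `wasserstein_d1` is hypothetical, and equal sample sizes are an assumption of this illustration):

```python
def wasserstein_d1(xs, ys):
    """d_1 distance between two empirical measures of equal size.

    Pairing the i-th smallest atom of xs with the i-th smallest atom of ys
    is the coupling (F^{-1}(U), G^{-1}(U)) for these empirical laws.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

Applied to two samples of simulated values of $X_{n,p}$, this gives a crude empirical proxy for the distance used in the convergence statements of this section.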

The metric $d_1$ makes $D$ a complete metric space. Convergence of $\mathcal{L}(X_n)$ to $\mathcal{L}(X)$ in $D$ is equivalent to convergence of $X_n$ to $X$ in distribution and

$\lim E[|X_n|] = E[|X|]$.

Therefore convergence in $D$ entails

$\lim E[X_n] = E[X]$.

We refer to [13] for an extensive treatment of Wasserstein metrics. In what follows, we shall improperly refer to the convergence of $X_n$ to $X$ in $D$, meaning the convergence of their distributions. Let us take care first of relation (19):

Theorem 4.1. If $\sum_{i=1}^{I} E\big[|A^{(i)}|\big] < 1$ and $E[|T|] < \infty$, then $\Psi$ is a strict contraction and (19) has a unique solution in $D$.

Proof. Let $(X, Y)$ be a coupling of random variables, with laws $\mu$ and $\nu$, respectively, such that

$E[|X - Y|] = d_1(\mu, \nu)$.

Let $\big((X^{(i)}, Y^{(i)})\big)_{1 \le i \le I}$ be $I$ independent copies of $(X, Y)$. Furthermore, assume that $C$ and $\big((X^{(i)}, Y^{(i)})\big)_{1 \le i \le I}$ are independent. Then the probability distribution of

$\sum_{i=1}^{I} A^{(i)} X^{(i)} + T$, resp. $\sum_{i=1}^{I} A^{(i)} Y^{(i)} + T$

is $\Psi(\mu)$ (resp. $\Psi(\nu)$) and

$d_1(\Psi(\mu), \Psi(\nu)) \le \sum_{i=1}^{I} E\big[|A^{(i)}|\, |X^{(i)} - Y^{(i)}|\big] \le d_1(\mu, \nu) \sum_{i=1}^{I} E\big[|A^{(i)}|\big]$.

Thus $\Psi$ is a contraction with contraction constant smaller than 1. Since $D$ is a complete metric space, this implies that $\Psi$ has a unique fixed point in $D$, by Banach's fixed point theorem. □

We prove now a theorem which is a variant of those used by the previously cited authors: the difference is not deep, but here we deal with families of laws, not sequences, as we have two parameters, $n$ and $p$. As a consequence, to cover Theorems 2.1 and 2.2 it will be convenient in their proofs to consider convergence with respect to a filter on $\mathbb{N} \times [0,1]$, see [2, Chap. 1]. The collection of sets

$V_{N,\varepsilon} = \{n \ge N\} \times ([c - \varepsilon, c + \varepsilon] \cap [0,1])$, $N > 0$, $\varepsilon > 0$,

is a basis for the filter $\mathcal{F}_1$ corresponding to Theorem 2.1, while

$V_{N,\varepsilon} = \{(n,p) \mid 0 < p < \varepsilon,\ n \ge N/p\}$, $N > 0$, $\varepsilon > 0$,

is a basis for the filter $\mathcal{F}_2$ corresponding to Theorem 2.2.

Theorem 4.2. Suppose that (18) holds for $n \ge 1$ and $X_{0,p} = 0$; i.e. $G_{n,p} = \Phi(G_{0,p}, \dots, G_{n-1,p})$ for $n \ge 1$ and $G_{0,p} = \delta_0$, where $G_{n,p} = \mathcal{L}(X_{n,p})$. If

i) $(E[X_{n,p}])_{n,p}$ is bounded,

ii) $\sum_{i=1}^{I} E\big[|A^{(i)}|\big] < 1$,

iii) $T_{n,p} \to T$ and $A^{(i)}_{n,p} \to A^{(i)}$ in $L^1$ along $\mathcal{F}$, for each $i$,

iv) $\lim_{\mathcal{F}} E\big[|A^{(i)}_{n,p}|\,;\ (Z^{(i)}_{n,p}, p) \notin V\big] = 0$, $\forall V \in \mathcal{F}$, for each $i$,

then $X_{n,p}$ converges in distribution to $F$, the unique solution of the equation $\Psi(F) = F$ in $D$. More precisely, $d_1(G_{n,p}, F) \to 0$ along $\mathcal{F}$.

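We need a lemma before proving Theorem 4.2.

Lemma 4.3. Assume that three families of nonnegative numbers $(a_{n,p})_{0 \le n,\, 0 \le p \le 1}$, $(b_{n,p})_{0 \le n,\, 0 \le p \le 1}$, and $(\gamma_{i,n,p},\ 0 \le n,\ 0 \le i < n,\ 0 \le p \le 1)$ satisfy the inequalities:

$a_{n,p} \le \sum_{i=0}^{n-1} \gamma_{i,n,p}\, a_{i,p} + b_{n,p}$.

Let $\mathcal{F}$ be a filter. Under the following assumptions:

- $a_{n,p}$ is nonnegative and bounded,

- for some $\Gamma < 1$ and some $V_0 \in \mathcal{F}$, $\forall (n,p) \in V_0$, $\sum_{k=0}^{n-1} \gamma_{k,n,p} \le \Gamma$,

- $\lim_{\mathcal{F}} b_{n,p} = 0$,

- $\forall V \in \mathcal{F}$, $\lim_{\mathcal{F}} \sum_{k : (k,p) \notin V} \gamma_{k,n,p} = 0$,

we have

$\lim_{\mathcal{F}} a_{n,p} = 0$.

Proof of Lemma 4.3. The proof is a variant of the proof of [14, Proposition 3.3]. Let $M$ be a bound for $a_{n,p}$, and let

$a = \limsup_{\mathcal{F}} a_{n,p}$.

For any $\varepsilon > 0$, let $V_\varepsilon \in \mathcal{F}$ be such that for $(n,p) \in V_\varepsilon$,

$a_{n,p} \le a + \varepsilon$.

Then for $(n,p) \in V_\varepsilon \cap V_0$ we have

$a_{n,p} \le \sum_{k : (k,p) \notin V_\varepsilon} \gamma_{k,n,p}\, a_{k,p} + \sum_{k : (k,p) \in V_\varepsilon} \gamma_{k,n,p}\, a_{k,p} + b_{n,p} \le M \sum_{k : (k,p) \notin V_\varepsilon} \gamma_{k,n,p} + (a + \varepsilon)\Gamma + b_{n,p}$.

Taking lim sups, we obtain that for any $\varepsilon > 0$,

$a \le (a + \varepsilon)\Gamma$.

Thus $a \le a\Gamma$, and so $a = 0$. □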

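Proof of Theorem 4.2. We can choose $X^{(i)}$ and the family $(X^{(i)}_{k,p})_{0 \le k,\, 0 \le p \le 1}$ in such a way that

$E\big[|X^{(i)}_{k,p} - X^{(i)}|\big] = d_1(G_{k,p}, F)$,

and we can also choose the families $\big((X^{(i)}_{k,p})_{k \ge 0,\, p},\ X^{(i)}\big)_{1 \le i \le I}$ to be i.i.d. Then

$d_1(G_{n,p}, F) \le E\left[\left|\sum_{i=1}^{I} \big(A^{(i)}_{n,p}\, X^{(i)}_{Z^{(i)}_{n,p},\,p} - A^{(i)} X^{(i)}\big) + T_{n,p} - T\right|\right]$

$\le \sum_{i=1}^{I} E\big[|A^{(i)}_{n,p}|\, |X^{(i)}_{Z^{(i)}_{n,p},\,p} - X^{(i)}|\big] + b_{n,p}$

$= \sum_{k=0}^{n-1} \gamma_{k,n,p}\, d_1(G_{k,p}, F) + b_{n,p}$,

with

$b_{n,p} = \sum_{i=1}^{I} E\big[|A^{(i)}_{n,p} - A^{(i)}|\, X^{(i)}\big] + E\big[|T_{n,p} - T|\big]$,

$\gamma_{k,n,p} = \sum_{i=1}^{I} E\big[|A^{(i)}_{n,p}|\, 1_{\{Z^{(i)}_{n,p} = k\}}\big]$.

Let $M$ be a bound for $(E[X_{n,p}])_{n,p}$, and set

$a_{n,p} = d_1(G_{n,p}, F)$.

Let us check the assumptions of Lemma 4.3:

$0 \le a_{n,p} \le E[X_{n,p}] + E[X] \le M + E[X]$;

for the second assumption of Lemma 4.3,

$\limsup_{\mathcal{F}} \sum_{k=0}^{n-1} \gamma_{k,n,p} = \limsup_{\mathcal{F}} \sum_{i=1}^{I} E\big[|A^{(i)}_{n,p}|\big] = \sum_{i=1}^{I} E\big[|A^{(i)}|\big] < 1$;

$\lim_{\mathcal{F}} b_{n,p} = 0$ by assumption iii), as

$\sum_{i=1}^{I} E\big[|A^{(i)}_{n,p} - A^{(i)}|\, X^{(i)}\big] = E[X] \sum_{i=1}^{I} E\big[|A^{(i)}_{n,p} - A^{(i)}|\big]$;

finally, by assumption iv),

$\sum_{k \text{ s.t. } (k,p) \notin V} \gamma_{k,n,p} = \sum_{i=1}^{I} E\big[|A^{(i)}_{n,p}|\,;\ (Z^{(i)}_{n,p}, p) \notin V\big] \to 0$.

Therefore $d_1(G_{n,p}, F)$ vanishes along $\mathcal{F}$ and the proof of the theorem is now complete. □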

The following Theorem is folklore. It gives the means and variances in Theorems
2.1 and 2.2 after some computations.



QUICKSORT WITH UNRELIABLE COMPARISONS: A PROBABILISTIC ANALYSIS 15 



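Theorem 4.4. Suppose that (19) holds, where $\sum_{i=1}^{I} E\big[|A^{(i)}|\big] < 1$ and $E[|X|] < \infty$; in other words, $\mathcal{L}(X) = F$, where $F$ is the unique solution in $D$ to $\Psi(F) = F$. Then

(21)  $E[X] = \frac{E[T]}{1 - \sum_{i=1}^{I} E\big[A^{(i)}\big]}$.

Moreover, if further $\sum_{i=1}^{I} E\big[|A^{(i)}|^2\big] < 1$ and $E[T^2] < \infty$, then $E[X^2] < \infty$ and

(22)  $\mathrm{Var}(X) = \frac{E\Big[\big(T + (\sum_{i=1}^{I} A^{(i)} - 1)\,E[X]\big)^2\Big]}{1 - \sum_{i=1}^{I} E\big[(A^{(i)})^2\big]}$.

Proof. Taking expectations in (19) we obtain $E[X] = \sum_{i=1}^{I} E\big[A^{(i)}\big] E[X] + E[T]$, which yields (21).

For the second part, let $D_2 = \{\mu \in D : \int x\, d\mu(x) = E[X],\ \int x^2\, d\mu(x) < \infty\}$. It is easy to see that now $\Psi$ is a strict contraction in $D_2$ with the $d_2$ metric; hence $\Psi$ has a unique fixed point in $D_2$. Since $D_2 \subset D$, this fixed point must be $F$, which shows that $E[X^2] < \infty$. If we square (19) and take the expectation, we obtain

$E[X^2] = \sum_{i=1}^{I} E\big[(A^{(i)})^2\big]\, E[X^2] + \sum_{1 \le i \ne j \le I} E\big[A^{(i)} A^{(j)}\big]\, (E[X])^2 + 2 \sum_{i=1}^{I} E\big[A^{(i)} T\big]\, E[X] + E[T^2]$,

which yields (22). □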



5. Proofs of Theorems 2.1 and 2.2

We apply Theorem 4.2 to the distributional identity (12), with $I = 2$,

$(A^{(1)}_{n,p}, Z^{(1)}_{n,p}) = (A_{n,p},\ Z_{n,p} - 1)$

and

$(A^{(2)}_{n,p}, Z^{(2)}_{n,p}) = (B_{n,p},\ n - Z_{n,p})$.

Here the distribution of $(A^{(i)}_{n,p}, Z^{(i)}_{n,p})$ does not depend on $i$. We verify the assumptions ii)-iv) of Theorem 4.2 for Theorems 2.1 and 2.2 together: for the second theorem take $c = 0$. The first assumption holds true by Proposition 3.5.

Verification of the second point. We have

$A^{(1)} = A = [(1-2c)U + c]^2$,
$A^{(2)} = B = [(2c-1)U + 1 - c]^2$,

and $c \in [0,1]$. Easy computations give

$E\big[[(1-2c)U + c]^2\big] + E\big[[(2c-1)U + 1 - c]^2\big] = \frac{2}{3}(1 - c + c^2) \le \frac{2}{3}$.

16 



ALONSO, CHASSAING, GILLET, JANSON, REINGOLD & SCHOTT 



Verification of the third point. We must prove the convergence of An,p, Bn,p 
and Tn,p to A, B and T(c, U), in . Recall 

From Proposition 13 . 21 we know that, conditioned on ?7, S£ ~ Bi([nt/] — l,p) and 
thus 

E((s£ - ([nC/] - l)pf I [/) = ([nC/] - l)p{l - p) < np. 
Hence, taking the expectation, 

E{se - ilnUI - < 

and thus 

\\se - nC/p||2 <\\se- ([nC/] - l)2j||2 +p < (npy/^ +p< 1{npfl'^. 
Consequently, 



(23) 



Vc 



< 



So 

— ~Up 

n 



and, similarly but more sharply. 



(24) 

Similarly. 
(25) 
and 
(26) 



si 



(l-C/)c 



From (jni, (EH and (123 follows 



(l-C/)^/^ 



0. 



(27) 



^"■^ ^ -(t/-[/c+(l-L/)c) 



0. 



It follows easily from Cauchy-Schwarz's inequality that multiplication is a contin- 
uous bilinear map x ^ L^. Hence (|27|) yields 



Ah 



(^^)'-(C/-C/c+(l-C/)c)^ 



0, 



verifying the first assertion. H27I) similarly implies ||i?„,p ^ ^lli ^0 too. 
For Tn^p we first observe that, similarly, from (|24ll and H26|) . 



^5 (7(1 - (7)c 

n-'p 



0. 



Moreover, since np oo, l|23(l and (|25(l imply ||sf/n^p||j^ < Hs^/n^pH^ ~* and 
||sr/"-^p||]^ 0. 

For the terms and we use Proposition 13.11 We have \\Sm — 'mp\\2 — 
\J nip{l — p) and thus, uniformly for < m < n. 



< 



1 



n^/p n 

which, using Cauchy-Schwarz again, yields 

Sm{Sm + 1) C /"m\ 2 



\Vp-Vc\^o, 



(28) 



2'n?p 



2 V 71 
Moreover, let W^^ = J^Zi Then EVl^,



/ m(m+l)p 



and 



\WL 



EW:jI = Var (M^;,) = ^ i'p{l - p) < m'p, 



and thus 
(29) 



W 

rfip 



1 /mN 2 

2 W 



< 



1 



2n 



Proposition 3.1 now yields, by (28) and (29), uniformly for $m \le n$,

Wrr ^ ^ r / m \ 2 



1 — c /m 
2 



0. 



0, 



Consequently, using Proposition 

in?p 2 

v?p 2 ^ ' 
Collecting the various terms above, we find $\|T_{n,p} - T\|_1 \to 0$.

Verification of the fourth point. As already noticed at the beginning of the
Section, the distribution of $\bigl(A^{(j)}_{n,p}, Z^{(j)}_{n,p}\bigr)$ does not depend on $j \in \{1, 2\}$, so in order
to prove the two theorems, we only have to check that the fourth assumption holds
for $j = 1$, for an arbitrary set in each of the two filters:

[\A„,p\ ; iZn,p~l,p)(^V] =0, yV e e {1,2}; 



(30) 



lim 



also, the expectation on the left hand side of (|30|l is decreasing in V, so we need 
only to check H3()(l for typical elements of the filters' basis. But for {n,p) G Vn.c 
(resp. for (n,p) e Vn.e), 



[| Ajip I ; {Zn^p-l,p)^VN,e] < 



N ~ 1 



n 



E 



< 



N 
np 



6. Proofs of Theorems 2.4 and 2.6

The proof of these theorems is done in four steps: 

(i) We prove that X{X) defined at is almost surely finite, and has expo- 
nentially decreasing distribution tail. Thus it has moments of all orders. 

(ii) With the help of a Poisson point process representation of Quicksort, we
prove the convergence of certain copies of $X_{n,p}$ to a copy of $X(\lambda)$ for the
norm $\|\cdot\|_1$. This entails the weak convergence.

(iii) We prove that X{X) satisfies the functional equation lO, and that (0) has 
a unique solution under the extra assumptions in Theorem 12.61 

(iv) We compute the first and second moments of $X(\lambda)$, as required for the
proof of Theorem 2.4, and we also give an induction formula for moments
of larger order.

Some properties of $X(\lambda)$. In this Section, we prove some properties of the family
of random variables {X{X))^yQ defined by Q). Recall that the increasing sequence 
(^'i:j)o<j<2'" defined by the recurrence relation Q, splits [0,1] in 2'' intervals, 
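obtained recursively by breaking each of the $2^{k-1}$ intervals of the previous step into
two random pieces. For $k \ge 0$ and $1 \le i \le 2^k$, let

$$\begin{aligned}
w_{k,i} &= Y_{k,i} - Y_{k,i-1}, \\
M_k &= \max\{w_{k,i} : 1 \le i \le 2^k\}, \\
F_{k,a} &= \Bigl(\frac{1+a}{2}\Bigr)^{k} \sum_{1 \le i \le 2^k} w_{k,i}^{a}, \\
\mathcal{F}_k &= \sigma\bigl(Y_{i,j} : i \le k,\ 1 \le j \le 2^i - 1\bigr).
\end{aligned}$$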

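We begin with a simple estimate (see also [7]):

Lemma 6.1. $\mathbb{E}\Bigl[\sum_{i=1}^{2^k} w_{k,i}^2\Bigr] = \bigl(\tfrac{2}{3}\bigr)^k$.

Proof. The length $w_{k,j} = |I_{k,j}|$ is the product of $k$ independent random variables,
each uniform on $[0, 1]$. Hence $\mathbb{E}[w_{k,j}^2] = 3^{-k}$. □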
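A quick Monte Carlo experiment illustrates the estimate of Lemma 6.1. This is a sketch under stated assumptions: each interval is split at an independent uniform point, the lemma is read as $\mathbb{E}[\sum_i w_{k,i}^2] = (2/3)^k$, and the function names are hypothetical.

```python
import random


def interval_lengths(k, rng):
    """Lengths of the 2^k intervals after k rounds of uniform splitting of [0, 1]."""
    lengths = [1.0]
    for _ in range(k):
        nxt = []
        for w in lengths:
            u = rng.random()            # split point, uniform on the interval
            nxt.append(u * w)
            nxt.append((1.0 - u) * w)
        lengths = nxt
    return lengths


def mean_sum_of_squares(k, trials=20_000, seed=0):
    """Empirical mean of sum_i w_{k,i}^2 over independent splitting trees."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        acc += sum(w * w for w in interval_lengths(k, rng))
    return acc / trials


for k in range(1, 5):
    print(k, mean_sum_of_squares(k), (2.0 / 3.0) ** k)
```

The two printed columns should agree up to sampling noise.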



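Lemma 6.2. For $a > 0$, $(F_{k,a})_{k \ge 0}$ is an $\mathcal{F}_k$-martingale, and $\mathbb{E}[F_{k,a}] = 1$.

Proof. Clearly $\mathbb{E}[F_{0,a}] = 1$. Also:

$$\begin{aligned}
\mathbb{E}[F_{k+1,a} \mid \mathcal{F}_k]
&= \Bigl(\frac{1+a}{2}\Bigr)^{k+1} \sum_{i=1}^{2^k} \mathbb{E}\bigl[w_{k+1,2i-1}^a + w_{k+1,2i}^a \mid \mathcal{F}_k\bigr] \\
&= \Bigl(\frac{1+a}{2}\Bigr)^{k+1} \sum_{i=1}^{2^k} w_{k,i}^a\, \mathbb{E}\bigl[U_{k,i}^a + (1 - U_{k,i})^a\bigr] \\
&= \Bigl(\frac{1+a}{2}\Bigr)^{k+1} \frac{2}{1+a} \sum_{i=1}^{2^k} w_{k,i}^a = F_{k,a}.
\end{aligned}$$

□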



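Let $\rho = 0.792977\ldots$ denote the larger real solution of the equation $\rho^{-1} = -2e \ln \rho$. Lemma 6.2 entails that

Lemma 6.3. $\mathbb{E}[M_k] \le \rho^k$.

Proof. Clearly,

$$M_k^a \le \sum_{i=1}^{2^k} w_{k,i}^a = \Bigl(\frac{2}{1+a}\Bigr)^{k} F_{k,a},$$

thus, for $a \ge 1$,

$$\mathbb{E}[M_k] \le \bigl(\mathbb{E}[M_k^a]\bigr)^{1/a} \le \Bigl(\frac{2}{1+a}\Bigr)^{k/a} \bigl(\mathbb{E}[F_{k,a}]\bigr)^{1/a} = \Bigl(\frac{2}{1+a}\Bigr)^{k/a}.$$

The rate $\bigl(\frac{2}{1+a}\bigr)^{1/a}$ reaches its minimum for $1 + a = 4.311\ldots$, a constant that is an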

old friend of Quicksort and binary search trees 0. This leads to the desired value 
for p. □ 
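The numerical values in this proof can be reproduced with a few lines of code. The sketch below assumes that the rate being minimized is $(2/(1+a))^{1/a}$; setting the derivative of its logarithm to zero gives, with $b = 1 + a$, the condition $b \ln(b/2) = b - 1$, which is solved here by bisection. The function name `optimal_exponent` is hypothetical.

```python
import math


def optimal_exponent(lo=2.0, hi=10.0, iters=200):
    """Bisection for b = 1 + a solving b*ln(b/2) = b - 1 on (2, 10)."""
    def g(b):
        return b * math.log(b / 2.0) - b + 1.0   # increasing for b > 2

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


b = optimal_exponent()
a = b - 1.0
rho = (2.0 / b) ** (1.0 / a)                     # minimal rate
print(b, rho)
print(1.0 / rho, -2.0 * math.e * math.log(rho))  # both sides of the equation for rho
```

At the optimum one has $\ln \rho = -1/b$, which is why the minimal rate also satisfies $\rho^{-1} = -2e \ln \rho$.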
A weaker form of this inequality (for $a = 2$), actually sufficient for our purposes,
is given in [7]. The sequence $(F_{k,a})_{k \ge 0}$ is a specialization of martingales that are of
great use for the study of general branching random walks,
of which binary search trees are a special case [12].

Lemma 6.4. $\mathbb{E}[X(\lambda)] = 1$.

Proof. Set Too = c (Xfcj , ^ > 0, 1 < j < 2'' - l). Inspecting we see that 



k>l 

because, conditionally given $\mathcal{F}_\infty$, the expected number of points of $\Pi_{k,j}$ is $\lambda w_{k,j}$
and each of them has an expected contribution $w_{k,j}/(2\lambda)$ to $X(\lambda)$. □

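As a consequence of Lemma 6.3, we have

Theorem 6.5. For each fixed $\lambda > 0$, the distribution tail $\mathbb{P}(X(\lambda) \ge x)$ decreases
exponentially fast.

Proof. Equivalently, we prove this result for $S(\lambda) = \lambda X(\lambda)$. Since

$$|x - Y_{k,J_k(x)}| \le M_k,$$

we have

$$S(\lambda) \le \sum_{(k,x) \in \Pi} M_k = \sum_{k \ge 1} N_k M_k,$$

where $N_k = |\Pi_k|$ is a Poisson random variable with mean $\lambda$. We split the tail of
this bound on $S(\lambda)$ as follows:

$$\mathbb{P}(S(\lambda) \ge x) \le \mathbb{P}\Bigl(\sum_{k \ge 1} N_k M_k \ge x\Bigr) \le P_1 + P_2,$$

in which

$$P_1 = \mathbb{P}\Bigl(\sum_{1 \le k \le m} N_k M_k \ge x/2\Bigr), \qquad P_2 = \mathbb{P}\Bigl(\sum_{k > m} N_k M_k \ge x/2\Bigr).$$

We have, by the standard Chernoff bound for the Poisson distribution,

$$P_1 \le \mathbb{P}\Bigl(\sum_{1 \le k \le m} N_k \ge x/2\Bigr) \le \exp\Bigl(\frac{x}{2}\bigl(1 - \ln(x/(2m\lambda))\bigr) - m\lambda\Bigr),$$

the last inequality holding only for $m \le \frac{x}{2\lambda}$. Also

$$\begin{aligned}
P_2 &\le \mathbb{P}\Bigl(\sum_{k > m} N_k \rho^{k/2} \ge x/2\Bigr) + \mathbb{P}\bigl(\exists k > m : M_k \ge \rho^{k/2}\bigr) \\
&\le \Bigl(1 + \frac{2\lambda}{x}\Bigr) \frac{\rho^{(m+1)/2}}{1 - \rho^{1/2}},
\end{aligned}$$

using a Markov first moment inequality to bound both terms. For any $a$ in $(0, 1)$,
the choice $m = \lfloor a x/(2\lambda) \rfloor$ leads to an exponential decrease of the tail. □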
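The Chernoff bound used for $P_1$ can be compared with exact Poisson tails. The sketch below assumes the bound in the form $\mathbb{P}(N \ge t) \le \exp(t(1 - \ln(t/\mu)) - \mu)$ for $N$ Poisson with mean $\mu$ and $t \ge \mu$ (here $\mu = m\lambda$ and $t = x/2$); the helper names are hypothetical.

```python
import math


def poisson_tail(mu, t):
    """P(N >= t) for N ~ Poisson(mu) and an integer t >= 0, by summing the pmf."""
    pmf = math.exp(-mu)          # P(N = 0)
    below = 0.0
    for j in range(t):
        below += pmf
        pmf *= mu / (j + 1)      # P(N = j + 1) from P(N = j)
    return max(0.0, 1.0 - below)


def chernoff_bound(mu, t):
    """exp(t(1 - ln(t/mu)) - mu); only claimed to bound the tail when t >= mu > 0."""
    return math.exp(t * (1.0 - math.log(t / mu)) - mu)


for t in (3, 5, 10, 20):
    print(t, poisson_tail(3.0, t), chernoff_bound(3.0, t))
```

For $t = \mu$ the bound equals 1, and it decays faster than any exponential in $t$, which is the source of the exponential decrease of $P_1$.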



Convergence of $X_{n,p}$ to $X(\lambda)$. We assume that the input list for Quicksort
contains the integers $\{1, 2, \ldots, n\}$ in random order. We model our error-prone
Quicksort as follows using the variables $U_{k,j}$ and $\Pi$ in Section 2, but with the
intensity $\lambda$ of $\Pi$ replaced by $\lambda(n,p) = -n \ln(1 - p)$:

In the first step, we use the pivot $p_{1,1} = \lceil nU_{0,1} \rceil$ and let for each $i$ (except the
pivot) there be an error in the comparison of $i$ and the pivot if $\Pi_1 \cap (\frac{i-1}{n}, \frac{i}{n}] \ne \emptyset$.
(Recall that $\Pi_k = \{x : (k,x) \in \Pi\}$.) Note that our choice of $\lambda(n,p)$ yields the right
error probability $p$: each such interval is free of points with probability $e^{-\lambda(n,p)/n} = 1 - p$.

Let $p'_{1,1}$ be the position of the pivot after the first step. (This position was
earlier denoted $Z_n$; it may differ from $p_{1,1}$ because of errors.) The items of the
left sublist will thus be placed in positions $1, \ldots, p'_{1,1} - 1$ and those in the right
sublist in positions $p'_{1,1} + 1, \ldots, n$. Let $p'_{1,0} = 0$ and $p'_{1,2} = 1 + n$.

When the $k$-th step begins, we have a set of $2^{k-1}$ sublists $(\ell_{k-1,j})_{j=1,\ldots,2^{k-1}}$, the
elements of $\ell_{k-1,j}$ being in positions $p'_{k-1,j-1} + 1, \ldots, p'_{k-1,j} - 1$, $j = 1, \ldots, 2^{k-1}$
(with the convention that the sublist is empty when $p'_{k-1,j} - p'_{k-1,j-1} \le 1$). In
each nonempty such sublist we choose as pivot the item with rank
$\lceil U_{k-1,j}(p'_{k-1,j} - p'_{k-1,j-1} - 1) \rceil$ in this sublist, so that its position in the final output will be exactly

$$p_{k,2j-1} = p'_{k-1,j-1} + \bigl\lceil U_{k-1,j}\,(p'_{k-1,j} - p'_{k-1,j-1} - 1) \bigr\rceil \tag{31}$$

in case no errors occur while processing the sublist. We assume an error is made
when comparing the element at position $i$ with the pivot $p_{k,2j-1}$ if $\Pi_k \cap (\frac{i-1}{n}, \frac{i}{n}] \ne \emptyset$.
Let $p'_{k,2j} = p'_{k-1,j}$. Let $p'_{k,2j-1}$ be the position of the pivot $p_{k,2j-1}$ after the
comparisons (as in the first step, $p'_{k,2j-1}$ may differ from $p_{k,2j-1}$ because of errors);
let $p'_{k,2j-1} = p'_{k-1,j}$ if the sublist was empty. Set

$$y_{k,j} = p_{k,j}/n \quad \text{and} \quad y'_{k,j} = p'_{k,j}/n.$$

We expect $y_{k,j}$ and $y'_{k,j}$ to converge to $Y_{k,j}$ as $n \to +\infty$.

This procedure (stopped when there are no more nonempty sublists) is an exact
simulation of the erratic Quicksort, so we may assume that $I(n,p)$ is the number of
inversions created by it. As in Section 3, let $I^{(k)}(n,p)$ be the number of inversions
created at step $k$, so

$$I(n,p) = \sum_{k=1}^{\infty} I^{(k)}(n,p).$$
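The erratic Quicksort can also be simulated directly. The following is a minimal sketch and not the Poisson coupling used in the proof: it assumes that the pivot is the first item of a randomly ordered list (so its rank is uniform), that each element comparison errs independently with probability $p = \lambda/n$, and that the normalization is $I(n,p)/(n^2 p)$. All function names are hypothetical.

```python
import random
from bisect import bisect_left, insort


def erratic_quicksort(items, p, rng):
    """Quicksort whose comparisons with the pivot each err with probability p."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    left, right = [], []
    for x in rest:
        goes_left = x < pivot
        if rng.random() < p:             # erroneous comparison: wrong sublist
            goes_left = not goes_left
        (left if goes_left else right).append(x)
    return erratic_quicksort(left, p, rng) + [pivot] + erratic_quicksort(right, p, rng)


def inversions(seq):
    """Number of pairs i < j with seq[i] > seq[j] (distinct elements assumed)."""
    seen, count = [], 0
    for x in reversed(seq):
        count += bisect_left(seen, x)    # smaller elements located to the right
        insort(seen, x)
    return count


def mean_normalized_inversions(n, lam, runs, seed=0):
    """Empirical mean of I(n, p) / (n^2 p) with p = lam / n."""
    rng = random.Random(seed)
    p = lam / n
    total = 0.0
    for _ in range(runs):
        data = list(range(1, n + 1))
        rng.shuffle(data)
        total += inversions(erratic_quicksort(data, p, rng)) / (n * n * p)
    return total / runs


print(mean_normalized_inversions(400, 2.0, 200))
```

Since $\mathbb{E}[X(\lambda)] = 1$ by Lemma 6.4, the printed value is expected to be close to 1 for large $n$, up to sampling noise and finite-size effects.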

We will prove that, using the notation of 0, 



(32) 



5n,k — 



^I^^\n,p) 



ri^p 



\{n,p) 



E E 



Xk,j I 
for each $k$. Since also, by Lemmas 3.4 and 6.1,



Sn,k < 



E 



I^'\n,p) 



X{n,p) 



E 



2" 



-E 



2k 



9 ""h 



it follows by dominated convergence that, using Q, 

oo 

\\Xn,p - X{\[n,p))\\^ < y^Jn,, 



0. 



Moreover, A(n,p) — > A, and it follows easily from ^ that ||X(A(n,p)) — X(A)||-|^ 
0. Hence we have E |X„.p — — > 0, which proves the convergence. 
It remains to verify Set 

j=i xeiikj 

Relation (32) is equivalent to



(33) 



0. 



For simplicity, we write in the sequel $\lambda$ instead of $\lambda(n,p)$. We begin with a lemma.

Lemma 6.6. For each $k$ and $j$,

$$\max\bigl\{\|Y_{k,j} - y'_{k,j}\|_1,\ \|Y_{k,j} - y_{k,j}\|_1\bigr\} \le \frac{k(1 + \lambda)}{n}.$$

Proof. Recall that $p'_{k,j} = n y'_{k,j}$, so (31) translates to

$$p_{k,2j-1} = n y'_{k-1,j-1} + \bigl\lceil U_{k-1,j}\,(n y'_{k-1,j} - n y'_{k-1,j-1} - 1) \bigr\rceil.$$

We use induction on $k$. Comparing the definitions of $Y_{k,j}$ and $y'_{k,j}$, we see that it
suffices to consider an odd $j = 2l - 1$, and in that case there are three sources of a
difference:

(i) The differences between $y'_{k-1,l-1}$ and $Y_{k-1,l-1}$ and between $y'_{k-1,l}$ and
$Y_{k-1,l}$. By the induction hypothesis, this contributes at most $(k - 1)(1 + \lambda)/n$.

(ii) The $-1$ inside (and the rounding by) the ceiling function. This contributes
at most $1/n$.

(iii) The shift of the pivot, from $p_{k,2j-1}$ to $p'_{k,2j-1}$, caused by the erroneous
comparisons. The shift is bounded by the total number of errors at step
$k$, so its mean is less than $\lambda$, and the contribution is less than $\lambda/n$.

□

We return to proving H33|) . For k = 1, l'-^'>{n,p) is just t{n,p) studied in Section 
El and lini) yields 

l'-^\n,p) = StSr + Si + Sr + Wp. 
Let $E_1$ be the set of items $i$ such that an error was made in the comparison with
pi.i. Relation Hll|l entails that 



E 



Pi 



l| ^W^+Wp+ ^eisi + l) + ^ri-Sr + 1). 



We shall denote this last sum 

Thus, we have 

/(^)(n,p) - i'^^\n,p) = \stSr + St + Sr- ^Siisi + 1) - ^SriSr + 1)| < sj 

Furthermore 

(34) E [s^ I [nC/i,i] = m] = (m - l)p(l - p) + ((m - l)^)^ < np 



2 2 

n p . 



Hence, 



/(i)(n,p)-/(i)(n,p) =0(1) 



Moreover, = EisEi I ^-yi,i I differs from = Ej=i E^reni,, 

in (|33|) in four ways only (recall that xi^i = xi^2 — ^1,1): 

(i) $i/n$ differs from $x$ by at most $1/n$. Since the expected number of terms is
not larger than $\lambda$, this gives a contribution $O(1/n)$.

(ii) $|y_{1,1} - x_{1,j}| = |y_{1,1} - Y_{1,1}|$, which by Lemma 6.6 has expectation $O(1/n)$.
Thus this too gives a contribution $O(1/n)$.

(iii) If there are two or more points in $\Pi_1 \cap (\frac{i-1}{n}, \frac{i}{n}]$ for some $i$, $X^{(1)}$ contains
more terms than the sum over $E_1$. It is easily seen that the expected number
of such extra points in each interval $(\frac{i-1}{n}, \frac{i}{n}]$ is less than $(\lambda/n)^2$, and each
point contributes at most 1 to $X^{(1)}$.

(iv) Each point in $\Pi_1 \cap (\frac{p_{1,1} - 1}{n}, \frac{p_{1,1}}{n}]$ contributes an extra term in $X^{(1)}$ again.
The expected number of such extra points is $\lambda/n$ and each of these terms
contributes at most 1 to $X^{(1)}$.

This verifies (33) for $k = 1$.

For $k \ge 2$ we argue similarly. We can approximate $I^{(k)}(n,p)$ by the sum of the
distances between the errors and the respective pivots,



/W(n,p)= 



E 



as follows: Let Ekj be the set of items i G ik,j subject to error when compared 
with Pfe+i.2j-i, and let Qk be the tr-algebra generated by (f^^j")f<fc j<2* ^^'^ I^i U 
112 U • • • U nfc_i. As for fc = 1, using relation we obtain the following bound: 



E 



I^^\n,p)-i^^\n,p) 



< 



< 2 



(p2 (#4_i,^-)'+P#4.-ij 



E 

j<2'"-i 

^-^(nV + np) =0(1), 



and as a consequence. 



/"'■)(n,p) -/W(n,p) =0(1) 



Now, 



2;fe,2j--l 

n 
differs from $X^{(k)} = \sum_{j=1}^{2^k} \sum_{x \in \Pi_{k,j}} |x - x_{k,j}|$ in (33) in the same four ways as for
$k = 1$, plus an extra fifth way:

(i) See the case $k = 1$.

(ii) $|y_{k,2j-1} - x_{k,2j-1}| = |y_{k,2j-1} - x_{k,2j}| = |y_{k,2j-1} - Y_{k,2j-1}|$, which by Lemma
6.6 has expectation $O(1/n)$. Thus this too gives a contribution $O(1/n)$.

(iii) Two or more points in $\Pi_k \cap (\frac{i-1}{n}, \frac{i}{n}]$ for some $i$, see the case $k = 1$.

(iv) Each point in $\Pi_k \cap (-\frac{1}{n} + y_{k,2j-1},\ y_{k,2j-1}]$ contributes an extra term
in $X^{(k)}$. The expected number of such extra points is $\lambda 2^{k-1}/n$ and each
of these terms contributes at most 1 to $X^{(k)}$.

(v) There is a new source of error in this approximation, because some points
$x$ in $\Pi_k$ and the corresponding positions $i = \lceil nx \rceil$ belong to subintervals
that do not correspond to each other, because the endpoints $y'_{k-1,j}$ differ
somewhat from $Y_{k-1,j}$. By Lemma 6.6 the expected number of such cases
is $O(1/n)$, so again we get a contribution of order $O(1/n)$ only.

This verifies (33) and thus the convergence of $X_{n,p}$ to $X(\lambda)$.

The distributional identity for $X(\lambda)$. We check that $X(\lambda)$ satisfies the distributional
identity and some side conditions needed for the computations of moments.

Proposition 6.7. $(X(\lambda))_{\lambda > 0}$ is a solution of (5). Moreover, $\mathbb{E}[X(\lambda)^n] < \infty$ and
$\lambda^n \mathbb{E}[X(\lambda)^n] \to 0$ as $\lambda \to 0$, for $n \ge 1$.

Proof. All moments are finite by Theorem 16.51 Moreover, E[(AX(A))"] — > as 
A ^ by l(Hl and dominated convergence. 

For $a < b$, let $\Pi(a, b)$ be a Poisson point process of intensity $\lambda$ on $\mathbb{N}^* \times [a, b]$,
and let $\{U_{k,i} : k \ge 0,\ 1 \le i \le 2^k\}$ be independent uniform random variables as in
Section 2, and further independent of $\Pi(a, b)$. Define $\{Y_{k,i} : k \ge 0,\ 1 \le i \le 2^k\}$ and
$J_k(x)$ as in Section 2, with the slight modification

$$Y_{0,0} = a \quad \text{and} \quad Y_{0,1} = b,$$

and set

$$X(\lambda, a, b) = \frac{1}{\lambda} \sum_{(k,x) \in \Pi(a,b)} |x - Y_{k,J_k(x)}|.$$

Note that $X(\lambda, 0, 1) = X(\lambda)$. Shifting and rescaling $\Pi(a, b)$, we obtain

$$X(\lambda, a, b) \overset{d}{=} X(\lambda, 0, b - a) \overset{d}{=} (b - a)^2\, X(\lambda(b - a)).$$
Let us split X{X): we have 



X{X) 


= Xo(A)+Xi(A) 


+ ^2 (A) 


AXo(A) 


- E 1- 

(l,a:)en(0,l) 


■Yi.i\ 


AXi(A) 


- E 1- 

(fc,a;)en(0,l) 


■^,Jfc(a;) 1 : 


XX2{X) 


= e' 1- 

(fc,x)en(o,i) 
fc>2,!ii>ri_i 


'Yk,Jk(x) 1 ■ 
We see, using general properties of Poisson point processes and the recursive
construction of $\{Y_{k,i} : k \ge 0,\ 1 \le i \le 2^k\}$, that

{Xo{x),x,{x),X2{x)) '= (e(A,Yi4),x(A,o,ri4),x(A,yi,i,i)) 

(e(A, Fia), ri^iX(AYia), (1 - Y,,,fX{X{l - Fia))), 

in the sense that, conditionally given that Yi^i = u, Xo{X), ^i(A) and X2{X) are 
independent and distributed as 6(A,m), m^X(Au), {1 — u)^ X {X{1 — u)) , respectively. 
Also Yi.i = [/o.i is uniformly distributed on [0, 1]. □ 

Uniqueness of solutions of (0). Let /i = (ma)a>o ^-nd ^ = (^a)a>o solu- 
tions of 13) in Let Y = {Y{X))^-^q and Z = {Z{X))^^q denote two measurable 
processes representing respectively ^ and d, in the sense of Remark 12.51 (i). With- 
out loss of generality, we can assume that Y and Z share the same underlying 
probabilistic space, and the same exponent a. Then, by definition of M, for A > 0, 

dA{Y,Z) = supE[A"|y(A)-Z(A)|] 

(0,A) 

is finite. Let 5 denote the infimum of d\{Y, Z) over all couples of representations 
(y, Z) of PL and 9, lying on the same probabilistic space, and assume that 5 > Q. 
Let (loi-^o) be such a couple of representations, satisfying furthermore 

dA(ro,^o)<<5^. 

Consider a probabilistic space on which are defined three independent random
variables $(Y_1, Z_1)$, $(Y_2, Z_2)$ and $U$, $(Y_1, Z_1)$ and $(Y_2, Z_2)$ being two copies of $(Y_0, Z_0)$, $U$
being uniform on $(0, 1)$. Finally, for every $\lambda > 0$, set

Y{X) = U^Yi{XU) + (1 - UfY2{X{l - U)) + e(A, U), 

z{\) ^ u^Zi{xu) + (1 - ufZ2{x{i - u)) + e(A, u). 

Then Y and Z are representations of ji (resp. 6) and satisfy Remark 12.51 (ii). 
Moreover, we have, for A G (0, A), 



A" 



f (A) - Z{X) = E [A" I C/2 Yi (At/) + (1 - [7)2^2 (A(l - f/)) 

-U^Zi{XU) - (1 - UfZ2{X{l - U))\ 

< 2E [A"t/2 \Yi(XU) - Zi{XU)\] 

< 2 / u^-" E [{Xu)" I Yi (Am) - Z^ {Xu)\] du 



leading to a contradiction. 

Moments of $X(\lambda)$. The aim of this Section is the computation of moments of
$X(\lambda)$, completing the proof of Theorem 2.4. If one uses directly (5), the computations
of moments by induction are hardly tractable because all three terms on
the right of (5) depend on $U$. To circumvent this problem, we consider a new
distributional identity

$$W(\lambda) \overset{d}{=} \xi(\lambda) + U\,W(\lambda U) + (1 - U)\,W(\lambda(1 - U)), \tag{35}$$

in which

• $\xi(\lambda)$ is as in Section 2; equivalently, $\xi(\lambda) = \sum_{x \in \Pi_1} x$;

• $\xi(\lambda)$ and $(U, W(\lambda U), W(\lambda(1 - U)))$ are independent;

• conditionally, given $U = u$, $W(\lambda U)$ and $W(\lambda(1 - U))$ are independent and
distributed as $W(\lambda u)$ and $W(\lambda(1 - u))$, respectively.

The next Propositions establish relations between $X(\lambda)$ and solutions of (35), eventually
providing an algorithm for the computation of moments of $X(\lambda)$ (see (36)
and (39)).

Proposition 6.8. The family $(Y(\lambda))_{\lambda > 0} = (\xi(\lambda) + \lambda X(\lambda))_{\lambda > 0}$, in which $\xi(\lambda)$ and
$X(\lambda)$ are assumed independent, is a solution of (35).

Proposition 6.9. The $n$-th moment of $Y(\lambda)$ is a polynomial of degree $n$ in the
variable $\lambda$.

Before proving Propositions 6.8 and 6.9 we need a lemma.

Lemma 6.10. The $n$-th moment $g_n(\lambda) = \mathbb{E}[\xi(\lambda)^n]$ is a polynomial of degree $n$ with
nonnegative coefficients and, for $n \ge 1$, $g_n(0) = 0$.

Proof. Owing to Campbell's Theorem p. 28], we have 



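$$\mathbb{E}\bigl[e^{t\xi(\lambda)}\bigr] = \exp\bigl(\lambda(\mathbb{E}[e^{tU}] - 1)\bigr) = \exp\Bigl(\lambda\Bigl(\frac{t}{2!} + \frac{t^2}{3!} + \cdots\Bigr)\Bigr).$$

Expanding the last expression gives the lemma. □

Proof of Proposition 6.8. To show (35), it is enough to show

$$\begin{aligned}
\lambda X(\lambda) &\overset{d}{=} U\,Y(\lambda U) + (1 - U)\,Y(\lambda(1 - U)) \\
&\overset{d}{=} U\,\xi(\lambda U) + \lambda U^2 X(\lambda U) + (1 - U)\,\xi(\lambda(1 - U)) + \lambda(1 - U)^2 X(\lambda(1 - U)),
\end{aligned}$$

where, as usual, conditioned on $U = u$, the terms on the right hand side are
independent with the right distributions. This follows immediately from (5), since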

Ae(A, u) <(Am) + (1 - u)e(A(l - u)). 



□ 



Proof of Proposition 6.9. Consider the sequence of integral equations

$$P_0(\lambda) = 1, \qquad P_n(\lambda) = 2\int_0^1 u^n P_n(\lambda u)\,du + \psi_n(\lambda), \quad n \ge 1, \tag{36}$$

in which

$$\psi_n(\lambda) = \sum_{\substack{r + k + \ell = n \\ k < n,\ \ell < n}} \binom{n}{r, k, \ell}\, g_r(\lambda) \int_0^1 u^k (1 - u)^\ell P_k(\lambda u) P_\ell(\lambda(1 - u))\,du, \tag{37}$$

where $g_r$ is the $r$-th moment of $\xi(\lambda)$. Proposition 6.9 is a consequence of the next
lemma. □

Lemma 6.11. The induction formula (36) and the initial condition $P_1(0) = 0$
define a unique sequence of polynomials, $(P_n(\lambda))_{n \ge 0}$. Furthermore, $P_n$ has degree
$n$, and vanishes at 0. For $n \ge 1$, the $n$-th moment $\mathbb{E}[Y(\lambda)^n]$ is equal to $P_n(\lambda)$.
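Proof. Consider $n \ge 1$ and assume that the properties in the lemma hold for
$1 \le m \le n - 1$. Then, for $k$ and $\ell$ smaller than $n$, and $r + k + \ell = n$, the expression

$$g_r(\lambda) \int_0^1 u^k (1 - u)^\ell P_k(\lambda u) P_\ell(\lambda(1 - u))\,du$$

is a polynomial with degree $n$ and, due to Lemma 6.10, vanishes at 0. Thus, in this
case, $\psi_n(\lambda)$ is a polynomial with degree $n$, vanishing at 0. It is now easy to check
that a polynomial $P_n(\lambda)$ satisfies (36) if and only if, for $(n, k) \ne (1, 0)$,

$$[\lambda^k] P_n = \frac{n + k + 1}{n + k - 1}\, [\lambda^k] \psi_n. \tag{38}$$

Also, by the induction assumptions,

$$\begin{aligned}
\mathbb{E}[Y(\lambda)^n] &= \mathbb{E}\bigl[(\xi(\lambda) + U\,Y(U\lambda) + (1 - U)\,Y((1 - U)\lambda))^n\bigr] \\
&= \sum_{r + k + \ell = n} \binom{n}{r, k, \ell}\, g_r(\lambda)\, \mathbb{E}\bigl[U^k (1 - U)^\ell\, Y(\lambda U)^k\, Y(\lambda(1 - U))^\ell\bigr] \\
&= 2\,\mathbb{E}[U^n Y(\lambda U)^n] + \psi_n(\lambda).
\end{aligned}$$

Note that $\psi_n(\lambda) \ge 0$ for $\lambda \ge 0$. By Remark 2.5, $\lambda \mapsto f_n(\lambda) = \mathbb{E}[Y(\lambda)^n]$ is
nonnegative and measurable. Thus, for $\lambda > 0$, we can rewrite the previous equation:

$$f_n(\lambda) = 2\int_0^1 u^n f_n(\lambda u)\,du + \psi_n(\lambda) = 2\lambda^{-n-1} \int_0^\lambda v^n f_n(v)\,dv + \psi_n(\lambda).$$

Since $f_n(\lambda)$ is assumed to be finite and $\psi_n(\lambda) \ge 0$, the integral on the right hand
side is convergent, and thus it is a continuous function of $\lambda$. As a consequence $f_n$
belongs to $C^\infty(0, +\infty)$, and is a solution on $(0, +\infty)$ of the following differential
equation:

$$\lambda f_n'(\lambda) + (n - 1) f_n(\lambda) = (n + 1)\psi_n(\lambda) + \lambda \psi_n'(\lambda).$$

By Proposition 6.7 and Lemma 6.10, $\lambda^{n-1} f_n(\lambda) \to 0$ as $\lambda \to 0$, but the general
solution of the differential equation is $P_n(\lambda) + C\lambda^{-n+1}$. Thus $f_n = P_n$ on $(0, +\infty)$.

□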

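As a consequence of these results, we deduce that:

Proposition 6.12. The function $\lambda \mapsto \lambda^n \mathbb{E}[X(\lambda)^n]$ is a polynomial of degree $n$
that vanishes at 0.

Proof. Since $Y(\lambda) = \xi(\lambda) + \lambda X(\lambda)$, with independent summands, we obtain

$$P_m(\lambda) = \mathbb{E}[Y(\lambda)^m] = \sum_{0 \le k \le m} \binom{m}{k}\, g_{m-k}(\lambda)\, \lambda^k\, \mathbb{E}[X(\lambda)^k]. \tag{39}$$

The result follows by induction. □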
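Computation of the first moments. The moments of $Y(\lambda)$, and thus of $X(\lambda)$,
can be computed up to arbitrary order with the help of (36) and (39). For the first
two moments, the calculations run as follows. Expanding the expression for
$\mathbb{E}[e^{t\xi(\lambda)}]$ in the proof of Lemma 6.10, we obtain

Lemma 6.13.

$$g_1(\lambda) = \mathbb{E}[\xi(\lambda)] = \tfrac{1}{2}\lambda \quad \text{and} \quad g_2(\lambda) = \mathbb{E}[\xi(\lambda)^2] = \tfrac{1}{3}\lambda + \tfrac{1}{4}\lambda^2.$$

Proposition 6.14.

$$\lambda\,\mathbb{E}[X(\lambda)] = \mathbb{E}[S(\lambda)] = \lambda \quad \text{and} \quad \lambda^2\,\mathrm{Var}(X(\lambda)) = \mathrm{Var}(S(\lambda)) = \tfrac{1}{3}\lambda + \tfrac{1}{12}\lambda^2.$$

Proof. Taking $n = 1$ in (37) and (38), we find, using Lemma 6.13,

$$\psi_1(\lambda) = g_1(\lambda) = \tfrac{1}{2}\lambda, \qquad P_1(\lambda) = \tfrac{3}{1} \cdot \tfrac{1}{2}\lambda = \tfrac{3}{2}\lambda.$$

Taking $n = 2$, we similarly find

$$\begin{aligned}
\psi_2(\lambda) &= g_2(\lambda) + 2 \cdot 2 g_1(\lambda) \int_0^1 u P_1(\lambda u)\,du + 2\int_0^1 u(1 - u) P_1(\lambda u) P_1(\lambda(1 - u))\,du \\
&= \tfrac{1}{3}\lambda + \tfrac{7}{5}\lambda^2, \\
P_2(\lambda) &= \tfrac{4}{2} \cdot \tfrac{1}{3}\lambda + \tfrac{5}{3} \cdot \tfrac{7}{5}\lambda^2 = \tfrac{2}{3}\lambda + \tfrac{7}{3}\lambda^2.
\end{aligned}$$

Since $Y(\lambda) = \xi(\lambda) + \lambda X(\lambda)$, with independent summands,

$$P_1(\lambda) = \mathbb{E}[Y(\lambda)] = \mathbb{E}[\xi(\lambda)] + \lambda\,\mathbb{E}[X(\lambda)],$$

which by Lemma 6.13 yields $\lambda\,\mathbb{E}[X(\lambda)] = \lambda$. Similarly,

$$\lambda^2\,\mathbb{E}[X(\lambda)^2] = P_2(\lambda) - \mathbb{E}[\xi(\lambda)^2] - 2\,\mathbb{E}[\xi(\lambda)]\,\mathbb{E}[\lambda X(\lambda)] = \tfrac{1}{3}\lambda + \tfrac{13}{12}\lambda^2,$$

which yields the variance formula. □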
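The moment computation can be mechanized with exact rational arithmetic. The sketch below is an illustration under stated assumptions: polynomials in $\lambda$ are coefficient lists, the moments $g_n$ of $\xi(\lambda)$ are built from the cumulants $\kappa_j = \lambda/(j+1)$ implied by the Campbell formula, $P_n$ is obtained from $\psi_n$ through the coefficient relation (38), and $\lambda^n \mathbb{E}[X(\lambda)^n]$ is recovered by inverting (39). All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def pmul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def pscale(c, p):
    return [c * a for a in p]


def xi_moments(N):
    """g_n from cumulants kappa_j = lambda/(j+1): m_n = sum_j C(n-1, j-1) kappa_j m_{n-j}."""
    g = [[Fraction(1)]]
    for n in range(1, N + 1):
        acc = [Fraction(0)]
        for j in range(1, n + 1):
            kappa = [Fraction(0), Fraction(1, j + 1)]
            acc = padd(acc, pscale(comb(n - 1, j - 1), pmul(kappa, g[n - j])))
        g.append(acc)
    return g


def psi(n, g, P):
    """psi_n as in (37); the integral of u^a (1-u)^b over [0, 1] is a! b! / (a+b+1)!."""
    total = [Fraction(0)]
    for k in range(n):
        for l in range(min(n - k, n - 1) + 1):
            r = n - k - l
            multinom = factorial(n) // (factorial(r) * factorial(k) * factorial(l))
            integral = [Fraction(0)] * (len(P[k]) + len(P[l]) - 1)
            for i, a in enumerate(P[k]):
                for j, b in enumerate(P[l]):
                    beta = Fraction(factorial(k + i) * factorial(l + j),
                                    factorial(k + i + l + j + 1))
                    integral[i + j] += a * b * beta
            total = padd(total, pscale(multinom, pmul(g[r], integral)))
    return total


def moments(N):
    """Return (g, P, M), where M[n] is lambda^n E[X(lambda)^n] as a polynomial."""
    g, P, M = xi_moments(N), [[Fraction(1)]], [[Fraction(1)]]
    for n in range(1, N + 1):
        ps = psi(n, g, P)
        P.append([Fraction(0)] + [Fraction(n + m + 1, n + m - 1) * ps[m]
                                  for m in range(1, len(ps))])
        acc = P[n]
        for k in range(n):
            acc = padd(acc, pscale(-comb(n, k), pmul(g[n - k], M[k])))
        M.append(acc)
    return g, P, M


g, P, M = moments(2)
print(P[1], P[2], M[2])
```

With `moments(2)` this reproduces the polynomials of the proof above; larger arguments give higher moments under the same assumptions.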

The formulas for mean and variance of $X(\lambda)$ can also be obtained directly from
(5) and Lemma 6.13; we leave this as an exercise.

7. Concluding remarks 

We have presented a probabilistic analysis of Quicksort when some comparisons
can err. Analysing other sorting algorithms such as merge sort, insertion sort or
selection sort is even more intricate. They do not fit into the model presented in this
paper, and further, more involved probabilistic models and arguments are required. We
conjecture that the same normalization holds for the number of inversions in the
output of merge sort for $n = 2^m \to +\infty$, $p = \lambda/n$, and that the limit law $\widetilde{X}(\lambda)$
satisfies

$$\mathbb{E}\bigl[\widetilde{X}(\lambda)\bigr] = \sum_{k \ge 0} \frac{2^k}{(2^k + 2)(2^k + 3)} = 0.454674373\cdots < \mathbb{E}[X(\lambda)].$$
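The constant in the conjecture can be checked numerically. The sketch below assumes that the conjectured mean is the series $\sum_{k \ge 0} 2^k / ((2^k + 2)(2^k + 3))$, which is the reading of the formula that matches the printed value $0.454674373\ldots$; the function name is hypothetical.

```python
def merge_sort_constant(terms=200):
    """Partial sum of sum_{k >= 0} 2^k / ((2^k + 2)(2^k + 3)); terms decay like 2^-k."""
    total = 0.0
    for k in range(terms):
        t = 2.0 ** k
        total += t / ((t + 2.0) * (t + 3.0))
    return total


print(merge_sort_constant())
```

The value is well below $\mathbb{E}[X(\lambda)] = 1$, in line with the inequality stated in the conjecture.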